# 36 Diagonalization: Part 1 - Eigenvalues, eigenvectors, characteristic equations.

## Diagonalizability.

At this point we know that, for a fixed $n$, the $n \times n$ square matrices can be classified by **similarity**, where $A \stackrel{\text{sim}}{\sim} B$ if they "are linear maps that do similar things". More precisely, $A \stackrel{\text{sim}}{\sim} B$ if and only if $A = P B P^{-1}$ for some invertible matrix $P$, and the column vectors of $P$ form a basis such that, in that coordinate system, $B$ does the same thing as $A$.

What we would like to investigate is a special class of matrices called **diagonalizable matrices:**

> **Definition.**
> A square matrix $A$ of size $n \times n$ is said to be **diagonalizable** if $A$ is similar to some $n \times n$ diagonal matrix $D$. In particular, $A = P D P^{-1}$ for some invertible matrix $P$ and diagonal matrix $D$.

Note that a diagonal matrix $D$ has the form
$$
D = \begin{pmatrix}\lambda_{1}\\&\lambda_{2}\\&&\ddots\\&&&\lambda_{n}\end{pmatrix},
$$
where it is necessarily square, and where every entry not on the diagonal is zero. The diagonal entries themselves can be zero. For short, we sometimes write $D =\text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n})$ to denote a diagonal matrix $D$ whose diagonal entries are $\lambda_{1},\lambda_{2},\ldots,\lambda_{n}$ in order.

**Example.** Let us consider the matrices
$$
A = \begin{pmatrix}1 & 2\\2 & 1\end{pmatrix}\quad\text{and}\quad B = \begin{pmatrix}1 & 2\\0 & 1\end{pmatrix}.
$$
I claim that $A$ is diagonalizable because $A = P D P^{-1}$ for
$$
P = \begin{pmatrix}1 & 1\\1 & -1\end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix}3 & 0\\ 0 & -1\end{pmatrix},
$$
which you can verify. We will talk about how I came up with these matrices $P$ and $D$ later. So here $A$ is diagonalizable because $A \stackrel{\text{sim}}{\sim} D$.

However, $B$ is not diagonalizable. Let us try to show this from first principles. Suppose $B$ is similar to some diagonal matrix
$$
D = \begin{pmatrix}\lambda_{1} & 0 \\ 0 & \lambda_{2}\end{pmatrix}.
$$
Then there is some invertible matrix $P$ such that $B = PDP^{-1} \iff B P = P D$. Write
$$
P = \begin{pmatrix}a & b\\c & d\end{pmatrix},
$$
then we have
$$
B P = P D \implies \begin{pmatrix}1 & 2\\0 & 1\end{pmatrix} \begin{pmatrix}a & b\\c & d\end{pmatrix} = \begin{pmatrix}a & b\\c & d\end{pmatrix} \begin{pmatrix}\lambda_{1}\\ & \lambda_{2}\end{pmatrix},
$$
which gives the system
$$
\begin{aligned} a + 2c &= \lambda_{1} a \\ b + 2d &= \lambda_{2} b \\ c &= \lambda_{1} c \\ d &= \lambda_{2} d \end{aligned}
\qquad\implies\qquad
\begin{aligned} (1-\lambda_{1})a + 2c &= 0 \\ (1-\lambda_{2})b + 2d &= 0 \\ (1-\lambda_{1})c &= 0 \\ (1-\lambda_{2})d &= 0 \end{aligned}
$$
This is an interesting situation with cases, depending on whether $\lambda_{1}$ and $\lambda_{2}$ are equal to $1$ or not.

Let us look at $\lambda_{1}$. If $\lambda_{1} \neq 1$, then $1-\lambda_{1} \neq 0$, so the third equation forces $c = 0$. And if $c = 0$, the first equation becomes $(1-\lambda_{1})a = 0$. But $1-\lambda_{1} \neq 0$, so this further makes $a=0$. So our matrix $P$ looks like
$$
P = \begin{pmatrix}a & b\\c & d\end{pmatrix} = \begin{pmatrix}0 & \ast \\ 0 & \ast\end{pmatrix}.
$$
This $P$ is not invertible. So we must have $\lambda_{1} = 1$.

Let us look at $\lambda_{2}$. If $\lambda_{2} \neq 1$, then $1-\lambda_{2}\neq 0$, so the fourth equation forces $d = 0$, and the second equation in turn forces $b = 0$. With these, our matrix $P$ looks like
$$
P = \begin{pmatrix}a & b\\c & d\end{pmatrix} = \begin{pmatrix}\ast & 0\\\ast & 0\end{pmatrix},
$$
and it is not invertible.
So we must have $\lambda_{2}=1$. But if $\lambda_{1}=\lambda_{2}=1$, our system of equations reads
$$
\begin{aligned} 2c &= 0 \\ 2d &= 0 \\ 0 &= 0 \\ 0 &= 0 \end{aligned}
$$
so $c = d = 0$, which implies
$$
P = \begin{pmatrix}a & b\\c & d\end{pmatrix} = \begin{pmatrix}\ast & \ast\\0 & 0\end{pmatrix},
$$
which again is not invertible! In conclusion, $B$ is not diagonalizable! $\blacklozenge$

**Important Remark.** In general, we can always use the definition of similarity to decide whether or not a matrix is diagonalizable, but this can be lengthy, especially when the matrix is large. Later we will see how to determine diagonalizability more effectively.

## Eigenrelations.

Let us investigate what happens when a square matrix is diagonalizable. In the following, every matrix is $n\times n$.

Suppose a matrix $A$ is diagonalizable. Then $A \stackrel{\text{sim}}{\sim} D=\text{diag}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n})$, so there is some invertible matrix $P$ such that
$$
A = P D P^{-1} \iff AP = P D.
$$
Suppose the columns of $P$ are $\vec v_{1},\vec v_{2},\ldots,\vec v_{n}$, with
$$
P = \begin{pmatrix}| & | & & |\\ \vec v_{1} & \vec v_{2} & \cdots & \vec v_{n} \\ | & | & & |\end{pmatrix}.
$$
Notice that $\vec v_{1},\vec v_{2},\ldots,\vec v_{n}$ form a basis because $P$ is an invertible matrix. Then $A P = P D$ says
$$
A \begin{pmatrix}| & | & & |\\ \vec v_{1} & \vec v_{2} & \cdots & \vec v_{n} \\ | & | & & |\end{pmatrix} = \begin{pmatrix}| & | & & |\\ \vec v_{1} & \vec v_{2} & \cdots & \vec v_{n} \\ | & | & & |\end{pmatrix} \begin{pmatrix}\lambda_{1}\\ & \lambda_{2}\\ & &\ddots \\ & & & \lambda_{n}\end{pmatrix}.
$$
Recalling how the matrix-vector product works on the left side, and how diagonal matrices behave in a matrix product on the right, we get
$$
\begin{pmatrix}| & | & & |\\ A\vec v_{1} & A\vec v_{2} & \cdots & A\vec v_{n} \\ | & | & & |\end{pmatrix} = \begin{pmatrix}| & | & & |\\ \lambda_{1} \vec v_{1} & \lambda_{2} \vec v_{2} & \cdots & \lambda_{n} \vec v_{n} \\ | & | & & |\end{pmatrix}.
$$
This says that for each column $i$, we have the relation
$$
A\vec v_{i} = \lambda_{i} \vec v_{i}.
$$
This is what we call an **eigenrelation**. Note $\vec v_{i} \neq\vec 0$ because $\vec v_{i}$ is a column of an invertible matrix. In general, we have the following definition:

> If $\vec v\neq\vec 0$, then a relation of the form $$ A \vec v = \lambda \vec v $$ is said to be an **eigenrelation**. Here $\lambda$ can be $0$.

In other words, what we have discovered is the following:

> If an $n\times n$ matrix $A$ is diagonalizable, then there is a basis of vectors $\vec v_{1},\vec v_{2},\ldots,\vec v_{n}$ and numbers $\lambda_{1},\lambda_{2},\ldots,\lambda_{n}$ that satisfy the eigenrelations $$ A \vec v_{i} = \lambda_{i} \vec v_{i} $$ for each $i$.

This is a profound observation. It tells us that to see whether a matrix $A$ is diagonalizable or not, we should try to find such a basis of vectors $\vec v_{i}$ and numbers $\lambda_{i}$ satisfying the eigenrelations $A \vec v_{i} = \lambda_{i} \vec v_{i}$.

## Eigenvalues and eigenvectors.

We now make more definitions.

> For a square matrix $A$, if a nonzero vector $\vec v \neq\vec 0$ and a scalar $\lambda$ satisfy the relation $$ A \vec v = \lambda \vec v, $$ then we say they satisfy an **eigenrelation**, and we say that $\lambda$ is an **eigenvalue** of $A$, and that $\vec v$ is an **eigenvector** of $A$ associated with the eigenvalue $\lambda$.

> An important note: an eigenvector $\vec v$ cannot be $\vec 0$; however, the eigenvalue $\lambda$ can be zero.
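As a quick sanity check of these definitions, here is a minimal Python/numpy sketch (assuming numpy is available; this is only a side illustration, not part of the hand computation above) that verifies the $2\times 2$ example from the Diagonalizability section: it checks that $A = PDP^{-1}$ and that each column of $P$ satisfies an eigenrelation with the corresponding diagonal entry of $D$.

```python
import numpy as np

# The 2x2 example from the Diagonalizability section above.
A = np.array([[1, 2],
              [2, 1]])
P = np.array([[1,  1],
              [1, -1]])
D = np.diag([3, -1])

# A should equal P D P^{-1} (up to floating point error).
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True

# Each column v_i of P should satisfy the eigenrelation A v_i = lambda_i v_i.
for i, lam in enumerate(np.diag(D)):
    v = P[:, i]
    print(np.allclose(A @ v, lam * v))           # True for both columns
```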
Geometrically, if $\vec v$ is an eigenvector of $A$, where $A\vec v = \lambda \vec v$, then the action of $A$ on $\vec v$ is to scale $\vec v$ by the scalar $\lambda$.

**Example.** Consider the matrix
$$
A = \begin{pmatrix}1 & 2 & 3\\6 & 0 & 0 \\ 1 & 2 & 3\end{pmatrix}
$$
and the vectors
$$
\vec v = \begin{pmatrix}1\\1\\1\end{pmatrix}, \quad \vec w = \begin{pmatrix}0\\3 \\-2\end{pmatrix}, \quad \vec u=\begin{pmatrix}1\\2\\1\end{pmatrix}, \quad \vec z = \begin{pmatrix}0\\0\\0\end{pmatrix}.
$$
Which of these are eigenvectors of $A$? And if so, what is the associated eigenvalue?

$\blacktriangleright$ We compute
$$
A\vec v = \begin{pmatrix}1 & 2 & 3\\6 & 0 & 0\\1 & 2 & 3\end{pmatrix}\begin{pmatrix}1\\1\\1\end{pmatrix} = \begin{pmatrix}6\\6\\6\end{pmatrix} = 6\begin{pmatrix}1\\1\\1\end{pmatrix},
$$
which shows $A \vec v = 6 \vec v$, so we see that $\vec v$ is an eigenvector of $A$ with eigenvalue $6$. Next, we compute
$$
A \vec w = \begin{pmatrix}1 & 2 & 3\\6 & 0 & 0\\1 & 2 & 3\end{pmatrix}\begin{pmatrix}0\\3\\-2\end{pmatrix} = \begin{pmatrix}0\\0\\0\end{pmatrix} = 0\begin{pmatrix}0\\3\\-2\end{pmatrix},
$$
which shows $A\vec w = 0\vec w$, so $\vec w$ is an eigenvector of $A$ with eigenvalue $0$. Next we have
$$
A \vec u = \begin{pmatrix}1 & 2 & 3\\6 & 0 & 0\\1 & 2 & 3\end{pmatrix}\begin{pmatrix}1\\2\\1\end{pmatrix} = \begin{pmatrix}8\\6\\8\end{pmatrix},
$$
which is not a scalar multiple of $\vec u$, so $\vec u$ is **not** an eigenvector of $A$. Lastly, the vector $\vec z = \vec 0$ is the zero vector, which by definition is not an eigenvector of anything. $\blacklozenge$

## Characteristic polynomial.

How can we find all the eigenvalues and eigenvectors of a square matrix, then? We start with eigenvalues. Note that if we have an eigenrelation $A \vec v = \lambda \vec v$ for some nonzero vector $\vec v$ and scalar $\lambda$, we can make the following chain of equivalent deductions:
$$
\begin{array}{c}
A \vec v = \lambda \vec v \quad \text{for some } \vec v \neq \vec 0 \\ \iff \\
A \vec v - \lambda \vec v = \vec 0 \quad \text{for some } \vec v \neq \vec 0 \\ \iff \\
(A - \lambda I)\vec v = \vec 0 \quad \text{for some } \vec v \neq \vec 0 \\ \iff \\
\vec v \in \text{ker}(A-\lambda I) \quad \text{for some } \vec v \neq \vec 0 \\ \iff \\
\text{the equation } (A-\lambda I)\vec v = \vec 0 \text{ has a nontrivial solution } \vec v\neq\vec 0 \\ \iff \\
A-\lambda I \text{ is not invertible} \\ \iff \\
\det(A-\lambda I) = 0
\end{array}
$$
This chain of deductions uses our understanding of the invertible matrix theorem. In particular, we have deduced the following:

> A scalar $\lambda$ is an eigenvalue of $A$ if and only if $$ A - \lambda I \text{ is not invertible}, $$ which holds if and only if $$ \det(A-\lambda I) = 0. $$

This gives us a method to find the eigenvalues of a square matrix! In particular, for any square matrix $A$, we define the polynomial
$$
p_{A}(t) = \det(A-tI)
$$
to be the **characteristic polynomial** of $A$, and by what we saw above, we see that

> For a square matrix $A$, a number $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a root of the characteristic polynomial $p_{A}(t)=\det(A-t I)$, that is, $p_{A}(\lambda) = 0$.

Computationally, $A-tI$ is just the matrix $A$ with $t$ subtracted from each diagonal entry, so
$$
\det(A-tI) = \det \begin{pmatrix}\ast-t & \ast & \ast & \cdots & \ast\\\ast & \ast-t & \ast & \cdots & \ast \\ \ast & \ast & \ast-t & & \ast \\ \vdots & & & \ddots & \vdots \\ \ast & \ast & \ast & \cdots & \ast -t\end{pmatrix}.
$$
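Before working an example out by hand, here is a minimal sketch of how one could check such a computation symbolically in Python with sympy (assuming sympy is available; this is only a side illustration). It uses the $3\times 3$ matrix from the eigenvector example above, which is also the matrix of the worked example below. Note that, as far as I can tell, sympy's built-in `charpoly` uses the other convention $\det(tI - A)$, so its output differs from ours by a sign here.

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 2, 3],
               [6, 0, 0],
               [1, 2, 3]])

# Characteristic polynomial with our convention p_A(t) = det(A - t I).
p = (A - t * sp.eye(3)).det()
print(sp.expand(p))   # -t**3 + 4*t**2 + 12*t
print(sp.factor(p))   # -t*(t - 6)*(t + 2)

# sympy's built-in charpoly appears to use det(t I - A) instead,
# which differs by a sign since A here is 3x3 (odd size).
print(A.charpoly(t).as_expr())  # t**3 - 4*t**2 - 12*t
```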
**Example.** Consider the matrix
$$
A = \begin{pmatrix}1 & 2 & 3\\6 & 0 & 0\\1 & 2 & 3\end{pmatrix}.
$$
Its characteristic polynomial is
$$
\begin{align*} p_{A}(t) & = \det(A-t I) = \det \begin{pmatrix}1-t & 2 & 3\\6 & -t & 0\\1 & 2 & 3-t\end{pmatrix} \\ &=(1-t)(-t)(3-t)+36 - (-3t)-12(3-t) \\ &= - t^3 + 4 t^2 + 12 t \\ &= - t(t-6)(t+2). \end{align*}
$$
So the roots of $p_{A}$ are $0,6,-2$, and hence the eigenvalues of $A$ are $0,6,-2$. $\blacklozenge$

**Remark.** When we compute the expression
$$
\det(A-t I),
$$
we get a polynomial in $t$ because, recall, computing a determinant only requires addition, subtraction, and multiplication of the entries of $A- tI$; there is no division involved. In some literature, one may see the characteristic polynomial of $A$ defined as $\det(tI-A)$ instead. In terms of finding eigenvalues, either version would be fine; however, the actual polynomials would be off by a multiplicative factor of $(-1)$ if the size of $A$ is odd. For our class, let us stick with the definition $p_{A}(t) = \det(A-tI)$ for consistency.

**Remark.** It would be good to review how to factor basic polynomials. I would recommend you review **long division** / **synthetic division** to factorize a polynomial once you know one of its roots, as well as the **rational root test** to find potential rational roots. But in general, finding roots of a generic polynomial is not so simple, and it is one of the main "hurdles" of linear algebra. In practice, these roots, and hence the eigenvalues of a matrix, are approximated numerically (see the short numerical sketch at the end of this post).

We will continue our story of diagonalization in the next parts, where we find these eigenvectors. We will encounter the ideas of algebraic and geometric multiplicity and of eigenspaces, see some magical relations between eigenvalues, and end with some applications of diagonalization.
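To illustrate the remark about numerical approximation, here is a minimal numpy sketch (assuming numpy is available; again just a side illustration) that approximates the eigenvalues of the $3\times 3$ matrix from the example above. Up to floating point error it recovers $6$, $-2$, and $0$.

```python
import numpy as np

# The 3x3 matrix from the characteristic polynomial example.
A = np.array([[1, 2, 3],
              [6, 0, 0],
              [1, 2, 3]])

# In practice eigenvalues are approximated numerically rather than
# obtained by factoring p_A(t) exactly.
vals = np.linalg.eigvals(A)
print(np.round(vals, 8))  # approximately [ 6. -2.  0.] (order may vary)
```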